Feasibility, acceptability, and bacterial recovery for community-based sample collection to estimate antibiotic resistance in commensal gut and upper respiratory tract bacteria

Vietnam has high rates of antibiotic use and resistance. Measuring resistance in commensal bacteria could provide an objective indicator for evaluating the impact of interventions to reduce antibiotic use and resistance. This study aimed to evaluate the feasibility, acceptability, and bacterial recovery for different sampling strategies. We conducted a cross-sectional mixed methods study in a rural community in Ha Nam Province, northern Vietnam, and collected structured interviews, samples, and in-depth interviews from households. Out of 389 households invited, 324 participated (83%), representing 1502 individuals. Samples were collected from these individuals (1498 stool, 1002 self-administered nasal swabs, and 496 health-worker (HW)-administered nasopharyngeal swabs). Pneumococci were recovered from 11.1% (128/1149) of the total population and 26.2% (48/183) of those under 5-years. Recovery was higher for HW-administered swabs (13.7%, 48/350) than self-administered swabs (10.0%, 80/799) (OR 2.06, 95% CI 1.07–3.96). Cost per swab was lower for self-administered ($7.26) than HW-administered ($8.63) swabs, but the overall cost for 100 positive samples was higher ($7260 and $6300 respectively). Qualitative interviews revealed that HW-administered nasopharyngeal swabs took longer to collect, caused more discomfort, and were more difficult to take from children. Factors affecting participation included sense of contribution, perceived trade-offs between benefits and effort, and peer influence. Reluctance was related to stool sampling and negative perceptions of research. This study provides important evidence for planning community-based carriage studies, including cost, logistics, and acceptability. Self-administered swabs had lower recovery, and though they were cheaper and quicker per swab, this would translate to higher costs for large population-based studies. Recovery might be improved by changes to swab type and transport medium, and by a better cold chain to the lab.

www.nature.com/scientificreports/ A list of all households registered in the study area was obtained from the commune health centre and was used as the sampling frame. No further eligibility criteria were applied. Households with children under 5-years old were oversampled, in order to capture higher levels of pneumococcal carriage. 389 households were selected in total using the runiform command in Stata V14: 93 without children under 5-years, and 296 with children under 5-years old (Fig. 1).
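The stratified draw described above can be sketched as follows. This is only an illustration in Python, not the authors' procedure: the study used the runiform command in Stata V14, and the sampling frame, household identifiers, and seed below are hypothetical placeholders. Only the stratum sizes (296 households with and 93 without children under 5-years) come from the text.

```python
import random


def select_households(frame, n_with_u5, n_without_u5, seed=2018):
    """Draw a stratified simple random sample from a household list.

    `frame` is a list of (household_id, has_child_under5) tuples.
    The seed is an arbitrary, hypothetical value used for reproducibility.
    """
    rng = random.Random(seed)
    with_u5 = [hid for hid, has_u5 in frame if has_u5]
    without_u5 = [hid for hid, has_u5 in frame if not has_u5]
    return {
        "with_u5": sorted(rng.sample(with_u5, n_with_u5)),
        "without_u5": sorted(rng.sample(without_u5, n_without_u5)),
    }


# Hypothetical frame: 2000 households, every fourth one has a child under 5.
frame = [(hid, hid % 4 == 0) for hid in range(1, 2001)]

# Stratum sizes taken from the text: 296 with and 93 without children under 5.
selected = select_households(frame, n_with_u5=296, n_without_u5=93)
print(len(selected["with_u5"]), len(selected["without_u5"]))
```

Sampling each stratum separately is what produces the oversampling: the share of selected households with young children is fixed by design rather than by their share of the frame.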
Data from this study were expected to inform sample sizes for future larger community-based carriage studies, and estimates of many of the outcome indicators were not known. We assumed that 11% of the whole population would carry S. pneumoniae, based on a study in Vietnam that used a similar ratio of oversampling households with children under 5-years old 17 . There were three arms (see below), and a minimum sample of 499 per arm would be needed to detect a recovery rate 45% lower in one of the self-swabbing arms (i.e. 6.1% recovery) with 80% power. This represents a practically important difference. We randomly selected and recruited households until 500 participants were enrolled in each group. For all other prevalence estimates, using the most conservative value of 50%, we also reached the minimum of 385 per group needed to estimate prevalence with a margin of error of 5% at a 95% confidence level. We expected carriage of Enterobacterales to be ubiquitous, so that we would have sufficient precision to later estimate resistance among these isolates.
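The two sample-size figures above can be approximated with standard normal-approximation formulas. The sketch below is an assumption about the method, since the text does not state which formula, continuity correction, or software was used; the function names are illustrative. The two-proportion result should therefore be read as close to, not identical with, the 499 per arm quoted above.

```python
from math import ceil
from statistics import NormalDist


def n_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-arm sample size to compare two proportions (two-sided test,
    unpooled normal approximation, no continuity correction)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_a + z_b) ** 2 * variance / (p1 - p2) ** 2)


def n_single_proportion(p, margin, conf=0.95):
    """Sample size to estimate one proportion within +/- `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


# 11% expected carriage versus a 45% lower recovery (6.1%), 80% power.
n_compare = n_two_proportions(0.11, 0.061)

# Most conservative prevalence (50%), 5% margin of error, 95% confidence.
n_precision = n_single_proportion(0.50, 0.05)

print(n_compare, n_precision)
```

The precision calculation gives 385, matching the text. The comparison calculation gives roughly 500 per arm, with the exact value depending on the formula variant chosen.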

Sample collection and laboratory analysis.
Four health-workers from the local commune health centre (CHC) were trained in data collection and nasopharyngeal swab collection. A list of selected households was given to health-workers, and potential participants were invited to participate. Health-workers visited households at an agreed time and collected written informed consent. Socio-demographic and behavioural information was collected by the health-workers, using structured questionnaires. All members of participating households were provided with containers and spoons for the collection of stool. There were initially three swabbing groups: self-administered nasal swabs using Amies with charcoal gel-based transport medium and a rigid cotton-tipped swab (Copan M40 414C Transystem™); self-administered nasal swabs using Amies without charcoal liquid transport medium and a flexible minitip Nylon® flocked swab (Copan eSwab™); and HW-administered nasopharyngeal swabs using Amies without charcoal liquid transport medium and a flexible minitip Nylon® flocked swab (Copan eSwab™) (Fig. 1). However, part way through the study we discovered a problem with the liquid transport medium, and the transport medium in those two arms was replaced with the Amies with charcoal transport medium (Copan M40 414C Transystem™) with a separate flocked swab (Fig. 1). From that point there were only two arms: self-administered nasal swabs and HW-administered nasopharyngeal swabs, both using Amies with charcoal.
Those in the self-swabbing group were provided with nasal swab kits (M40 414C Transystem™). Health-workers explained how to collect samples and provided a leaflet with diagrams explaining how to insert the swab about 2 cm into each nostril (anterior nares), rotate it 2-3 times, then place the swab into the tube until it reaches the end of the tube and seal it (see Supplementary material). All households were provided with sample collection containers and leaflets for stool collection, and asked to store samples in a cool, dry place. Samples were collected from households the next day, stored in cool boxes with gel ice-packs during transit, and delivered to OUCRU and NIHE labs for immediate processing (within 24-h of collection). Those in the HW-administered nasopharyngeal swab group arranged a time for the health-worker to come back after the interview and collect swabs from all household members. The HW-administered swab tubes were taken back to the health centre, put directly into a cool-box with gel ice-packs, and transported to the lab. Samples were cultured immediately on receipt in the laboratories. Due to the volume of samples, it was not feasible or cost-effective to store leftover specimens.
Standard microbiology culture and identification techniques were used to analyse the swab contents for the presence of S. pneumoniae 32 . Nasal/nasopharyngeal samples were cultured overnight at 37 °C in a 5% CO2 atmosphere on 5% sheep blood agar supplemented with 5 mg/L of gentamicin to inhibit Staphylococcus aureus. Presumptive S. pneumoniae colonies were selected by colony morphology and α-haemolysis and subjected to optochin-susceptibility testing. MALDI-TOF (MBT Library 8468) was used to confirm the S. pneumoniae identification of optochin-susceptible colonies. Stool samples were cultured directly on two MacConkey agar No. 3 plates (CM0115, to inhibit enterococci growth) with 2 mg/L ceftazidime or 0.5 mg/L meropenem 33 . Plates were incubated overnight and large pink (lactose-fermenting) colonies were considered positive for third-generation cephalosporin resistant Enterobacterales (C3GRE, a proxy for ESBL) or carbapenem-resistant Enterobacterales (CRE). The number of bacterial types growing was determined by colony morphology.
Statistical analysis. Bacterial recovery was assessed through comparison of recovery rates of S. pneumoniae between swabbing groups. Proportions of positive samples were calculated, with 95% confidence intervals. Recovery rates between swabbing and demographic groups were compared using logistic regression, with household as a random effect. Carriage rates are presented separately for children under 5-years and those aged 5-years and over. Feasibility was assessed through an internal review of the logistical procedures and storage, as well as qualitative interviews. We also calculated the cost per sample associated with each swabbing method, to allow cost comparisons. Costs were calculated per household in each swabbing group and divided by the average number of household members. Costs were separated into swabs/swab packs, health-worker time costs, transportation, lab consumables, and lab staff time.
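To illustrate how a per-sample cost translates into the cost of obtaining 100 positive samples, the sketch below divides the cost per sample by the recovery rate. This is an assumed reconstruction, not the authors' stated formula; the inputs are the per-swab costs and rounded recovery rates reported in the abstract, and the function name is illustrative.

```python
def cost_per_100_positives(cost_per_sample, recovery_rate):
    """Expected cost (USD) of collecting enough samples to yield 100
    positives, assuming a constant recovery rate per sample."""
    return 100 * cost_per_sample / recovery_rate


# (cost per swab in USD, recovery rate) as reported in the abstract.
methods = {
    "self-administered nasal swab": (7.26, 0.100),
    "HW-administered nasopharyngeal swab": (8.63, 0.137),
}

costs = {
    name: cost_per_100_positives(cost, rate)
    for name, (cost, rate) in methods.items()
}
for name, total in costs.items():
    print(f"{name}: ${total:,.0f}")
```

Under these assumptions the results come out close to the reported $7260 and $6300, which shows how a method that is cheaper per swab can be more expensive per positive isolate when its recovery rate is lower.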
Acceptability was assessed through refusal and withdrawal rates from the study logbook, as well as qualitative interviews. Results have been reported in accordance with guidelines for Strengthening the Reporting of Observational studies in Epidemiology (STROBE) (see STROBE checklist in supplementary files).
Qualitative data collection and analysis. In-depth interviews with households and health-workers that participated in the survey were conducted in August 2018. Each household participant interview was held within a week after samples were collected, in order to learn about participant experiences and perspectives and to inform choices about measures and methods for future community-based sample collection. We planned to interview 3-6 study participants from each arm, purposively selected to represent a range of ages, family structures (i.e. with or without children under 5-years), and educational backgrounds. Saturation can develop fairly rapidly in studies where the aim is simply to understand common perceptions and experiences among a group of relatively homogeneous individuals, and we expected this small sample size to be sufficient 34 . We also planned to interview all four of the health-workers who conducted the fieldwork. Interviews explored participants' and health-workers' perceptions, experiences, and attitudes towards swabbing and sample collection, as well as how they stored the samples.
Thematic analysis was performed using a general inductive approach with NVivo 11 35 . Participants' views were summarised according to the research questions about the feasibility, acceptability, and bacterial recovery for each method. Data were reviewed with the research areas in mind, but no a priori models were imposed. Lower order units of meaning were identified and clustered into themes. Within each theme, subtopics and different perspectives were looked for and then themes were reviewed for redundancy and to capture the essence of each category. Data were coded using these themes and quotes representing themes were selected and translated into English. Scripts were re-read at different stages of the analysis, to ensure that reporting remained true to the data.
Ethics approval and consent to participate. The

Results
Study population. We approached 389 households. In 62, the main respondent was not available for interview, and 3 refused, leaving 324 who agreed to participate, representing an 83.3% overall response rate. We collected data for 1502 individuals from these households. Faecal samples and nasal or nasopharyngeal swabs were collected from 1498 individuals (Fig. 1). In total, 180 samples for C3GRE testing and 363 samples for CRE testing were rejected due to a lab error with growth media. Most nasal/nasopharyngeal samples transported in liquid medium were lost (345) due to batch failure, compared to only a few using gel-based medium (4). Reference strains at different dilutions and delays also failed to grow in the liquid transport medium. Participant characteristics are shown in Table 1. Participants in each swabbing group were comparable, though slightly more stool and swab samples from children under 5-years, children with no education, and the poorest households were discarded, which may affect generalisability. Saturation for in-depth interviews was reached after six interviews with self-swabbing participants and three with HW-swabbing participants. Respondents were mainly farmers with secondary or high-school education, except one participant with college-level education. One respondent was male, eight were female, and family sizes ranged from one to seven persons. Three of the four health-workers involved in data collection agreed to be interviewed.
In unadjusted regression models, recovery was higher among those who were swabbed by health-workers (OR 2.06, 95% CI 1.07-3.96) (Table 3). All age-groups over 10-years had lower carriage than children under 5-years, while labourers and those not working had lower carriage than farmers. Consistent with the age effect, children who were in primary or secondary school had lower carriage than those who were not yet in school. There was higher carriage in Period 3 (spring) (13.2%) and Period 2 (autumn) (12.1%) than in Period 1 (summer) (6.2%), but this difference was only significant for children under 5-years. Having an ARI in the last 2-weeks (OR 3.13, 95% CI 1.38-7.11) and using antibiotics for ARI in the last 2-weeks (OR 3.13, 95% CI 1.29-7.57) or last month (OR 2.07, 95% CI 1.08-3.96) were associated with higher carriage. After stratifying by age, health-worker swabbing, having an ARI in the last 2-weeks, and using antibiotics for ARI were no longer associated with higher carriage. There was no pneumococcal vaccination in this population.
Bacterial recovery from stools. Enterobacterales were recovered from all samples that were tested. C3GRE were found in 1233 of 1318 (93.6%) tested stool samples and CRE were found in 17 of 1135 (1.5%) tested stool samples. All samples containing CRE also contained C3GRE. Full details will be reported in a separate publication.
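As a worked example of attaching 95% confidence intervals to recovery proportions such as those above, the sketch below uses the Wilson score interval. The choice of the Wilson method is an assumption made for illustration; the text does not state which interval method was used, and the function name is hypothetical. The counts are those reported for C3GRE and CRE.

```python
from math import sqrt
from statistics import NormalDist


def wilson_ci(positives, n, conf=0.95):
    """Return (proportion, lower, upper) using the Wilson score interval."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = positives / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return p, centre - half, centre + half


# Counts reported in the text for stool samples.
results = {
    "C3GRE": wilson_ci(1233, 1318),
    "CRE": wilson_ci(17, 1135),
}
for name, (p, low, high) in results.items():
    print(f"{name}: {100 * p:.1f}% (95% CI {100 * low:.1f}-{100 * high:.1f})")
```

The Wilson interval is used here because it behaves well for proportions near 0 or 1, which applies to both the very common C3GRE and the rare CRE.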
Feasibility of sample collection. Five main themes related to feasibility emerged through the qualitative interviews: workload for data collectors; sample collection procedures; concerns about quality; storage and transportation; and disruptions (Supplementary Table 1).
Workload for data collectors. Health-workers visited each household at least twice: once to introduce the study, collect consent forms and conduct the interview, and once to collect samples and/or administer sample collection. For both swabbing groups, additional visits might be required to obtain consent forms from all participating household members (Table 1). For the HW-swabbing group, it could take up to 5-6 visits to finish sample collection. As working adults and schoolchildren were typically away from home during the day, health-workers might have to revisit late in the evening or very early in the morning to take their swabs. The estimated total amount of time health-workers spent on each household, including travel, gaining consent, and collecting samples, was 65 min for the two self-swabbing groups and 135 min for the HW-swabbing group. The time required per sample, per 100 samples, and per 100 positive samples is shown in Table 1.
Sample collection procedures. All interviewees responded that the leaflet instructions were straightforward and easy to understand, particularly as the health-workers provided thorough explanations and reminders throughout the process. In the self-swabbing groups, it was not difficult for adults to perform the nasal swab, except for minor discomfort. With small children or elderly people, however, a few participants were less confident and asked health-workers to help. In practice, this meant that the health-worker offered instructions, but did not collect the sample themselves. Swabbing children, even when performed by health-workers, was harder as some children got scared and refused to stay still. Other mothers did not have a problem swabbing their small children as they were used to similar tasks when taking care of the children's daily health and hygiene. In the health-worker swabbing groups, some participants reported discomfort with the nasopharyngeal swabs. Stool sampling proved to be a bigger issue for most interviewees. Disgust arose due to the smell and having to transfer stool from the paper bowl to the sample container. Health-workers found stool samples "very dirty" and "time-consuming" to collect. Timing was also a challenge, as participants had to ensure that the stool had been taken no more than 12 h before health-workers collected it, which was much more difficult to coordinate for young children and could therefore delay collection for several days.
Concerns about quality. Concerns about differences in quality between self-swabbing and HW-swabbing emerged. Health-workers felt that self-swabbing would reduce their workload, but they were concerned that self-taken swabs would be less reliable. They thought that community members might not swab properly, and do it "just like normal nose cleaning", leading to low-quality samples and low positive rates. This concern may have led to their willingness to swab participants in the self-swabbing groups if asked, though in practice this only happened on a few occasions.
Storage and transportation. Fieldwork was arranged to ensure that samples were transported to the lab as soon as possible after collection. Participants were instructed by health-workers to collect samples on the evening before the planned collection date, and store them in a cool, dry place overnight. All participants responded that they did this; however, this usually meant at room temperature, which ranged from 8 °C to 38 °C during the study period. They did not want to keep samples in their fridge for fear that stool samples would contaminate food. Our laboratory logbooks showed that 97.9% (1,463/1,494) of samples arrived at the lab within 24-h of collection, in compliance with the manufacturer's instructions 36 . The remaining 31 arrived within 48-h (Table 1).
Disruptions. We planned to conduct the survey and collect samples over a 2-3 month period; however, several factors disrupted fieldwork. Most community participants were engaged in some form of agriculture, and seasonal activities such as harvests kept them busy and made them difficult to meet at certain times. Health-workers also had competing priorities, with their own routine work to attend to. For example, vaccination campaigns stopped data collection for a week each month. We also stopped data collection for a few weeks while we investigated the problem with the liquid transport media.
Cost. Per swab, HW-administered swabs ($8.63) cost more than self-administered swabs ($7.26) to collect and culture, and this was mainly due to the cost of health-worker time, and the separate nasopharyngeal flocked swab (Table 1). However, given the higher bacterial recovery, the total cost required to obtain 100 samples positive for S. pneumoniae would be lower ($6300 for HW-administered compared to $7260 self-administered). Stool samples cost $3.97 to collect and culture, and $397 for 100 samples positive for Enterobacterales.
Acceptability of sample collection. In general, participants were happy to take part, and confirmed their willingness to participate in a similar survey in the future. One elderly participant, having experienced pain with the HW-administered swab, was reluctant to take part again. Five main themes were identified related to motivation to participate in the study: sense of contribution; self-benefits; perceived effort of taking part; influence from others; and previous experience with health studies (Supplementary Table 2).

Sense of contribution.
For some interviewees, while the study had little or no direct individual benefit, they were willing to take part for the community's interest. One interviewee linked individual health to collective health, stating that any "epidemic" could not be suppressed by only one person. Health-workers said the financial benefit did not offset the workload, but they were doing this because it was their responsibility.
Perceived self-benefits. Some participants were motivated by the perceived benefits for themselves or their families, for example, they were glad to be made aware of a new health problem through the study. Some thought this was a free health check-up, and they would get results if they had any disease, reflecting a misunderstanding about the study's objectives.
Perceived effort of participating. The perceived effort of taking part also influenced participation. Most interviewees found the level of effort acceptable. In the self-swabbing group, many participants stated that any adult could perform the swab and it was "mild", "just a bit of discomfort", or even "nothing to worry about". Swabbing of small children was also "not so difficult", especially when done by caregivers who were used to calming and persuading their children. By comparison, experiences with the HW-administered nasopharyngeal swab were described as "very painful" or "eye-watering", and it could take several tries to swab children. However, since this was performed by health-workers, it was not a burden on participants. In contrast, the effort perceived by health-workers was much higher: the more work health-workers took on, the less the participants had to do and the less time it took them.
Despite the complaints about stool sampling, some interviewees thought it was fine because they had been provided with all the equipment to collect samples hygienically. The clarity of instructions and access to health-workers if they had further questions also made the process acceptable.
Influence from others. Some family members were initially reluctant to participate, but yielded to the insistence and persuasion of their wives/mothers/grandmothers, or the head of the household. One interviewee persuaded her children by saying that the local doctor's family also provided samples for this study. People could be influenced to participate by their good relationship with and appreciation of the local health-workers. Participants mentioned frequently counting on health-workers for health advice, and their trusting relationship was apparent from the way they addressed or talked to one another. They also appreciated the health-workers for their effort and persistence during the survey and contribution to the community in general.
Previous experience with health studies. A few participants mentioned their previous experience of participating in health-related studies, which facilitated their understanding and motivated them to participate in this survey.
Two main themes emerged about reluctance to participate: disgust at the idea of sampling stool; and negative perceptions and issues in understanding the research (Supplementary Table 2).
Disgust at the idea of sampling stool. Most participants felt disgusted and awkward about collecting and handling stool. This was mentioned as a reason for initial reluctance rather than refusal to take part altogether.
Negative perceptions and issues in understanding the research. Unwillingness to participate was also linked to problems with understanding the purpose of the study or what the samples would be used for. Some interviewees reasoned that certain people were harder to convince because they had not heard the health-workers' briefing or were not familiar with health activities. In some cases, the study was rejected or doubted due to a negative impression of research in general and aversion to the idea of becoming test subjects.

Discussion
The aim of the study was to inform future research on bacterial carriage and antibiotic resistance at the population-level. We explored the feasibility, acceptability, and bacterial recovery for sample collection in the community, and compared different methods of swabbing. Recovery of S. pneumoniae was 11.1% overall, and 26.2% among children under 5-years. Recovery was higher for health-worker administered swabs (13.7%) than self-administered swabs (10.0%) (OR 2.06, 95% CI 1.07-3.96). We found that it was feasible to collect swabs and stool samples in the community on a large scale. Issues that emerged for consideration in future studies were workload for data collectors, sample collection procedures, concerns about quality, storage and transportation, and disruptions due to agricultural work and competing health-worker priorities. The total cost of collecting and processing each sample was higher for health-worker collected swabs ($8.63) than for self-collected swabs ($7.26) and for stool ($3.97). However, given the higher recovery rate, the total cost to obtain 100 positive swabs to use for subsequent susceptibility testing or sequencing would be less for HW-administered swabs ($6300) compared to self-administered swabs ($7260). We found that community members were willing to participate in the study, and only 3 households refused, indicating a high acceptability of taking part. Qualitative data highlighted themes related to sense of contribution, perceived self-benefits, perceived effort of participating, influence from others, and previous experience with health studies. Reasons for reluctance to participate included disgust with the idea of collecting stool samples, and negative perceptions of research. High recovery of third-generation cephalosporin resistant Enterobacterales (93.6%) is worrying, but validates the sampling procedures reported here. Future publications will report prevalence and determinants of resistance in S. pneumoniae and Enterobacterales in this population in more detail.
Overall, our results were similar to a previous pneumococcal carriage study in central Vietnam, in which doctors collected nasopharyngeal swabs from both healthy children and adults in a household survey in Nha Trang, October 2006 17 . They found an overall percentage of S. pneumoniae carriage of 11%. However, the carriage among children under 5-years was higher than ours at 43%. Other community-based studies in Vietnam also found higher carriage among children under 5-years, ranging from 28.7 to 52% 8,17-24 . However, most of these studies were conducted more than 10 years ago, before the rapid development and transition of Vietnam to a lower-middle income country. The most recent of these estimates, based on health-worker collected nasopharyngeal swabs in 2019, reported lower carriage of 28.7% 21 . A 2014 meta-analysis looking at carriage of S. pneumoniae in healthy children under 5-years old also found higher carriage in less developed countries, with a higher pooled prevalence in low-income countries (64.8%) than lower-middle income countries (47.8%) 16 . In a 2007 Vietnam-based study there were big differences in carriage between urban (26%), sub-urban (22%) and rural areas (60%). Together these findings suggest that the environment is important in determining S. pneumoniae carriage 22 .
Collection method is also important in determining bacterial recovery, and health-workers may better adhere to sample collection procedures and improve yield. But health-worker involvement also increases cost and time, and may be problematic in large-scale community studies. In many studies that use health-worker collected swabs, participants are invited to health facilities, or approached when presenting at antenatal or vaccination services. This reduces the burden on health-workers and allows samples to be plated immediately or refrigerated shortly after collection. However, in large-scale community surveys, inviting participants to come to the health centre can be complicated, may result in high refusal rates 17 , and potentially introduce biases. We collected samples during household visits, but this required health-workers to make many repeat visits to households, making this method more costly and time-consuming. To reduce the burden on sample collectors, we explored the feasibility of using self-collected samples. In our study, self-swabbing required fewer household visits and was generally found to be simple and easy to do for most participants. Comparison of health-worker swabbing and self-swabbing for COVID-19 and other respiratory pathogens shows that self-swabbing is acceptable and has comparable sensitivity and specificity 29,37,38 , though other reviews report lower sensitivity with self-swabbing from the same swab sites 39 .
Health-worker collected samples may also have resulted in higher bacterial recovery compared to self-collected samples due to higher pneumococcal density in the nasopharynx compared to anterior nares. There have been very few studies directly comparing nasal and nasopharyngeal swabs for detecting pneumococcal carriage in healthy people. A community survey in the UK found no significant differences in the recovery of S. pneumoniae between self-taken nasal swabs and HW-taken nasopharyngeal swabs 29 . A Canadian study found 95.2% sensitivity for nasal swabbing compared to nasopharyngeal swabbing for S. pneumoniae detection in children under 6-years, but only 52.5% sensitivity in those aged 6-15 years 40 . In our study recovery was lower or similar for self-swabbing in all age-groups. We know of no other studies in resource-limited settings that have compared these two methods. Most of the carriage studies in Vietnam used nasopharyngeal swabs 8,19 , while three older studies used nasal swabs 17,20,22 , so it is difficult to compare them directly. All were taken by trained healthcare workers, and recovery was similar (28.7-52% for nasopharyngeal swabs, and 35-49.4% for nasal swabs).
Another reason for higher recovery with HW-swabbing may be the use of nylon flocked swabs rather than the cotton-tipped swabs used for self-swabbing. Nylon flocked swabs are more efficient than Dacron or rayon in recovering S. pneumoniae both in vivo and in vitro 41 . The WHO expert working group recommends calcium alginate, rayon, Dacron, or nylon swabs, though experts also suggested that flocked swabs could improve sensitivity of detection, since they allow easier bacterial elution from swabs into STGG as well as higher yields of organisms 25 . There are no studies directly comparing yield with nylon flocked swabs and cotton-tipped swabs.
In the end, both arms of our study used the same transport medium, Amies with charcoal, whereas most previous studies in Vietnam have used STGG, the primary transport medium recommended by WHO. STGG is cheap, easy to make with ingredients that are commonly available 25 , and has been widely used in many carriage studies in different resourced settings with high recovery of S. pneumoniae 17,42,43 . There have been no field-based studies comparing STGG with other transport media to sustain S. pneumoniae viability at room temperature or  29 .
In this study, we used M40 Transystem™ Amies with charcoal media because it was available as a commercial swab kit and was easy to send to households for self-sampling. Transportation time and storage procedures are other factors that might have affected S. pneumoniae viability and differed between groups. The WHO expert group recommends that the swab be inserted in a closed tube with STGG transport media, placed in a cool-box or on wet ice and transported to the laboratory within 8 h 25 . According to the suppliers of the M40 Transystem swabs, they are stable for up to 24 h at room temperature (20-25 °C) 36 . Two other studies in Vietnam had swab transportation times up to 12 h using charcoal transport media at a cool temperature, and still had high recovery of pneumococci 8,19 . In a community-based survey in the UK, transportation took 1-2 days by post or taxi, with no refrigeration 29 . Although 97.9% of our samples arrived at the lab within 24-h, as recommended by the manufacturer, there was also a recommendation that samples be stored in a fridge or at room temperature during transit (20-25 °C) 36 . Some of our study took place during summer when average temperatures in the area were far higher than recommended, with daily lows around 26 °C and daily highs over 30 °C 45 . Even during cooler spring and autumn months, maximum daytime temperatures were often above 25 °C. In particular, self-collected swabs were stored at room temperature overnight, and recovery during summer months for this group was particularly low (4.7%).
This is the largest community-based study to date comparing carriage rates and bacterial recovery of S. pneumoniae between self-swabbing and health-worker swabbing in a low- and middle-income country setting. This study provides important evidence for planning community-based carriage studies, including cost, logistics, and participant feedback about acceptability of different methods. The large, randomly-selected sample reduced possible selection biases, and data collectors were trained to reduce response bias. Lab analyses were conducted blind to the swabbing group. However, the limitations of this study include the loss of 349 samples, mainly in the liquid media groups, limiting the power of analyses looking at determinants of pneumococcal carriage. This also meant that we were unable to compare yield for the liquid and gel-based transport media. There were several methodological differences between self-swabbing and health-worker swabbing groups, including the person who collected the sample, swab site, type of swab, and time to cold storage, and we do not know which of these was responsible for the observed differences in bacterial recovery. Because we simultaneously collected swabs and stool samples, we are unable to say which placed more burden on data collectors in terms of number of return visits to households, and data collectors made multiple visits to households even in the self-swabbing arm in order to collect stool samples. Although we reached saturation, the small number of in-depth interviews, and the conduct of the study before self-swabbing for COVID-19 became widely used, may also limit the generalisability of qualitative results.

Conclusions
Sample collection for large community-based studies is feasible and acceptable to participants. Health-worker collected swabs were more costly and time-consuming to collect and caused some discomfort, but had higher pneumococcal recovery rates. Higher recovery means that the total cost for 100 positive samples was lower than for self-collected swabs. Higher recovery from health-worker swabs may have been due to better adherence to the sampling procedure, the nasopharyngeal sampling site, the flocked swab, storage directly in a cool-box after collection, or a combination of these. Future studies will need to consider these trade-offs for population-based sample collection to estimate antibiotic resistance in commensal bacteria.

Data availability
Illustrative quotes from the qualitative data are provided in the supplementary tables. The quantitative datasets generated and analysed during the current study are not publicly available as this was not included in the consent process, but anonymised datasets can be made available from the corresponding author on reasonable request.